Abstract

Linearizability, the de facto correctness condition for concurrent data structure implementations,
is known to lead to poor scalability despite its intuitive appeal. This disadvantage has led
researchers to design scalable data structures satisfying consistency conditions weaker than
linearizability. Despite this recent trend, sequential consistency, a strictly weaker consistency
condition than linearizability, has received no interest.

In this paper, we investigate the applicability of sequential consistency as an alternative 
correctness criterion for concurrent data structure implementations. Our first finding formally 
justifies the reluctance in moving towards sequentially consistent data structures: implementations
in which each thread modifies only its thread-local variables are sequentially consistent for
various standard data structures such as pools, queues and stacks. We also show that for almost
all data structures, and all the data structures we consider in this paper, it is possible to have
sequentially consistent behaviors in which a designated thread does not synchronize at all. As
a potential remedy, we define a hierarchy of quantitatively strengthened variants of sequential
consistency such that the stronger the variant, the more synchronization it enforces, which at the
limit is equal to that enforced by linearizability.

Keywords and phrases Concurrency, Formal Specification, Data Structures, Sequential Consistency,
Linearizability

Digital Object Identifier 10.4230/LIPIcs.xxx.yyy.p 

1 Introduction 

The tension between performance and correctness is well known when it comes to developing
low-level library routines implementing concurrent data structures. On the one hand,
scalability, the ability to fully utilize the parallelism offered by the underlying architecture and 
generally accepted to be the main determinant of overall performance, is adversely affected 
by the need to synchronize among threads. On the other hand, correctness criteria, usually 
known as consistency conditions, enforce lower bounds on the amount of synchronization. 

In order to break the impasse in favor of scalability, research has focused on weakening
the notion of correctness. The move towards alternative consistency conditions potentially
leading to better performance was initially in the domain of memory implementations. The
goal was to replace sequential consistency [13] (SC) with weaker memory consistency models.
Recently a similar move has been under way for linearizability,
which has hitherto been the criterion for correctness in the domain of concurrent data
structures and is now being replaced with weaker notions of correctness. There are
already many scalable implementations benefiting from these weaker notions.

In this paper, we investigate SC as an alternative relaxation of linearizability. Linearizability
requires that the effect of a method be globally visible (i.e. to other executing threads)
before it completes. SC relaxes this by requiring only thread-local ordering. That is, if
two methods $m$ and $m'$ are called by the same thread in that order, then the requirement
by SC is that the effect of $m$ appear to be before that of $m'$. Furthermore, because SC is
generally accepted to be the most intuitive memory consistency model and concomitantly 
is very well understood and studied, it is surprising that it has received no consideration 
until now. Intriguing though it may be, we show that the apparent reluctance is actually 
warranted. 

Our investigation of SC ranges over five common data structures: two variants of pools
($\mathcal{P}$, $\mathcal{P}'$), queues ($\mathcal{Q}$), stacks ($\mathcal{S}$) and register (banks) ($\mathcal{R}$). We show that for all five of them
it is possible to construct SC implementations where one thread, say $t$, can arbitrarily delay
its synchronization with other threads as long as the sequence of local events of $t$ satisfies
a certain property, which we call robustness. Basically, a sequence of events (method calls
with return values) over a data structure $\mathcal{D}$ is robust if the same sequence can be executed
regardless of the state $\mathcal{D}$ is in. For instance, enqueueing or pushing an element is a robust
sequence (of length 1) in $\mathcal{Q}$ and $\mathcal{S}$, respectively.

We also show that for pool, queue and stack implementations being SC guarantees even
less. We define the class of composable data structures to which pool, queue and stack belong.
Intuitively a data structure is composable if for any pair of valid behaviors there exists an 
interleaving of this pair which is again a valid behavior. For composable data structures, 
an implementation in which threads do not synchronize at all is SC. To better understand 
the strength of such a result, consider a typical concurrency programming problem which 
contains N tasks to be generated by the producer threads and to be completed by the 
consumer threads. If the producers are to convey their tasks to the consumers over an SC 
queue (pool or stack), due to lack of synchronization the program will end with the queues 
of producers containing tasks and consumers having done nothing at all! 

As a possible remedy, we propose a natural modification to the definition of SC. We call
an implementation $k$-SC if every thread has to synchronize at least once after executing $k$
local events. This definition is strong enough to rule out the pathological implementations
we consider in this paper. It also naturally spans the domain of implementations between
SC and linearizability via a quantitative stratification. The smaller $k$ is, the stronger $k$-SC is,
which at the limit reduces to linearizability (equivalent to 0-SC).

To summarize, we make the following contributions: 
- Define the properties robustness and composability for data structures,
- Prove that essentially broken SC implementations exist for data structures with either of
  these properties,
- Span the range between SC and linearizability by bounding the non-synchronized event
  sequences.

1.1 Related Work 

Directly related to our work, there have been two other works on relaxing the notion of
linearizability. In [5], Henzinger et al. propose a framework in which it is possible to
quantitatively relax any sequential data structure. Their framework enables one to define
a desired metric and, for any data structure $\mathcal{D}$, defines $\mathcal{D}_k$ to be all behaviors that are at
most $k$ away from some valid behavior of $\mathcal{D}$. A concurrent behavior is associated with
the set of potential sequential witnesses, as defined by linearizability, and the concurrent
behavior is correct if at least one potential witness belongs to $\mathcal{D}_k$. In contrast, our work
modifies the set of potential sequential witnesses, by using SC rather than linearizability,
and the concurrent behavior is correct if at least one potential witness from this extended
set belongs to $\mathcal{D}$. Jagadeesan and Riely consider quiescent consistency (QC) as an
alternative relaxation to linearizability. They quantitatively span the range between QC and
linearizability. Similar to our work, their quantitative metric is also defined over concurrent
behaviors. Unlike us, they do not consider whether QC allows pathological cases. Neither
work formally establishes the relation between synchronization and the characteristics of
data structures as we do in this work.

2 Notation 

For a set $A$, let $Pow(A)$ denote the set of all subsets of $A$. Let $\lambda x.0$, with $x$ ranging over
elements of $A$, denote the function that maps all elements of $A$ to 0. Let $A[B]$ denote the
collection of functions from $A$ to $B$. For $f \in A[B]$, $f[a \mapsto b]$ denotes the function that agrees
with $f$ on $A$ except for $a$, which is mapped to $b$. A sequence $\sigma$ of length $n$ over some alphabet
$A$ is denoted by $\sigma(1) \cdot \sigma(2) \cdots \sigma(n)$. Alternatively, we also use the notation $(\sigma(i))_{i \in [1,n]}$ to
denote $\sigma$. Let $len(\sigma)$ denote the length of $\sigma$. For simplicity of presentation, unless stated
explicitly to be otherwise, for every sequence $\sigma$ and for any distinct $1 \le i, j \le len(\sigma)$ we assume that
$\sigma(i) \neq \sigma(j)$. Let $last(\sigma)$ denote the last symbol of $\sigma$; i.e. $last(\sigma) = \sigma(len(\sigma))$. Let $Set(\sigma)$
denote the set of symbols occurring in $\sigma$; i.e. $Set(\sigma) = \{\sigma(i) \mid i \in [1, len(\sigma)]\}$. Each sequence
$\sigma$ induces a total order over $Set(\sigma)$, the appears-before order $<_\sigma$, such that $i < j$ iff $\sigma(i) <_\sigma \sigma(j)$.
Let $A^*$ denote the set of all sequences over $A$, with $\varepsilon$ denoting the empty sequence.

A labelled transition system (LTS) is a tuple $LTS = (Q, q_0, L, \rightarrow)$, where $Q$ is the set of
states, $q_0$ is the initial state, $L$ is a set of labels, and ${\rightarrow} \subseteq Q \times L \times Q$ is a transition relation.
We write $q \xrightarrow{l} q'$ if $(q, l, q') \in {\rightarrow}$. A run $r = q_0 \cdot l_1 \cdot q_1 \cdots l_n \cdot q_n$ is an alternating sequence of
states and labels such that for all $i \in [1,n]$ we have $q_{i-1} \xrightarrow{l_i} q_i$. The trace of $r$, $tr(r)$, is the
sequence of labels occurring in $r$; i.e. $tr(r) = (l_i)_{i \in [1,n]}$. Let $Tr(LTS)$ denote the set of all
traces of $LTS$.

2.1 Data Structures 

A data structure $\mathcal{D}$ is a pair $(D, \Sigma_{\mathcal{D}})$, where $D$ is the data domain and $\Sigma_{\mathcal{D}}$ is the method
alphabet. For all the data structures we consider in this paper, we take $D$ to be the set of
natural numbers, $\mathbb{N}$, possibly augmented with a distinguished symbol NULL. An event of
$\mathcal{D}$ is a quadruple $(id, m, d_i, d_o)$, where $id \in \mathbb{N}$ is an event identifier, $m \in \Sigma_{\mathcal{D}}$ is a method, and
$d_i, d_o \in D$ are input and output arguments, respectively. Intuitively, $(id, m, d_i, d_o)$ denotes
the application of method $m$ with input argument $d_i$ returning the output value $d_o$. When
the input (resp. output) argument is not used in the event, we write $(id, m, \bot, d_o)$ (resp.
$(id, m, d_i, \bot)$). We will assume that each event has a unique event identifier. We will use $E_{\mathcal{D}}$
to denote the set of all events of $\mathcal{D}$. A duplicate-free sequence over $E_{\mathcal{D}}$ is called a $\mathcal{D}$-behavior.
The semantics of data structure $\mathcal{D}$ is a set of $\mathcal{D}$-behaviors, each of which is called a valid
behavior. For each data structure $\mathcal{D}$, we will define a labelled transition system $LTS_{\mathcal{D}}$ such
that $e \in Tr(LTS_{\mathcal{D}})$ iff $e$ is a valid $\mathcal{D}$-behavior. Below we list the data structures that we will
consider in this paper.

2.1.1 Pool, $\mathcal{P}$

The method alphabet $\Sigma_{\mathcal{P}}$ of a pool is the set $\{put, take\}$. Events of $\mathcal{P}$ are written as
$put^{id}(x)$, short for $(id, put, x, \bot)$, and $take^{id}(x)$, short for $(id, take, \bot, x)$. For conciseness,
from this point on we will omit the superscript $id$. Events with $put$ are called put events,
and those with $take$ are called take events. We use $Put$ and $Take$ to denote the set of all
put and take events, respectively.

$LTS_{\mathcal{P}}$ is defined as $(Pow(\mathbb{N}), \emptyset, \Sigma_{\mathcal{P}}, \rightarrow_{\mathcal{P}})$, where $\rightarrow_{\mathcal{P}}$ is defined as:

- $q \xrightarrow{put(x)}_{\mathcal{P}} q'$ iff $q' = q \cup \{x\}$,
- $q \xrightarrow{take(x)}_{\mathcal{P}} q'$ iff $x \in q$ and $q' = q \setminus \{x\}$,
- $q \xrightarrow{take(NULL)}_{\mathcal{P}} q'$ iff $q = q' = \emptyset$.

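2.1.2 Pool with Membership, $\mathcal{P}'$

The method alphabet $\Sigma_{\mathcal{P}'}$ is $\Sigma_{\mathcal{P}} \cup \{mem\}$. An event of $\mathcal{P}'$ is either a $\mathcal{P}$ event or of the form
$mem(x, y)$, short for $(id, mem, x, y)$. Events with $mem$ are called query events, and $Mem$ denotes
the set of all query events.

$LTS_{\mathcal{P}'}$ is defined as $(Pow(\mathbb{N}), \emptyset, \Sigma_{\mathcal{P}'}, \rightarrow_{\mathcal{P}'})$, where $\rightarrow_{\mathcal{P}'}$ is defined as:

- $q \xrightarrow{l}_{\mathcal{P}'} q'$ if $q \xrightarrow{l}_{\mathcal{P}} q'$,
- $q \xrightarrow{mem(x,y)}_{\mathcal{P}'} q'$ iff $q = q'$, and either $y = x$ and $x \in q$, or $y = x + 1$ and $x \notin q$.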


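2.1.3 Queue, $\mathcal{Q}$

The method alphabet $\Sigma_{\mathcal{Q}}$ is the set $\{enq, deq\}$. Events of $\mathcal{Q}$ are written as $enq(x)$, short
for $(id, enq, x, \bot)$, and $deq(x)$, short for $(id, deq, \bot, x)$. Events with $enq$ are called enqueue
events, and those with $deq$ are called dequeue events. We use $Enq$ and $Deq$ to denote the set
of all enqueue and dequeue events, respectively.

$LTS_{\mathcal{Q}}$ is defined as $(\mathbb{N}^*, \varepsilon, \Sigma_{\mathcal{Q}}, \rightarrow_{\mathcal{Q}})$, where $\rightarrow_{\mathcal{Q}}$ is defined as:

- $q \xrightarrow{enq(x)}_{\mathcal{Q}} q'$ iff $q' = q \cdot x$,
- $q \xrightarrow{deq(x)}_{\mathcal{Q}} q'$ and $x \neq NULL$ iff $q = x \cdot q'$,
- $q \xrightarrow{deq(NULL)}_{\mathcal{Q}} q'$ iff $q = q' = \varepsilon$.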


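2.1.4 Stack, $\mathcal{S}$

The method alphabet $\Sigma_{\mathcal{S}}$ is the set $\{push, pop\}$. Events of $\mathcal{S}$ are written as $push(x)$, short
for $(id, push, x, \bot)$, and $pop(x)$, short for $(id, pop, \bot, x)$. Events with $push$ are called push
events, and those with $pop$ are called pop events. We use $Push$ and $Pop$ to denote the set of
all push and pop events, respectively.

$LTS_{\mathcal{S}}$ is defined as $(\mathbb{N}^*, \varepsilon, \Sigma_{\mathcal{S}}, \rightarrow_{\mathcal{S}})$, where $\rightarrow_{\mathcal{S}}$ is defined as:

- $q \xrightarrow{push(x)}_{\mathcal{S}} q'$ iff $q' = q \cdot x$,
- $q \xrightarrow{pop(x)}_{\mathcal{S}} q'$ and $x \neq NULL$ iff $q = q' \cdot x$,
- $q \xrightarrow{pop(NULL)}_{\mathcal{S}} q'$ iff $q = q' = \varepsilon$.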


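2.1.5 Register, $\mathcal{R}$

The method alphabet $\Sigma_{\mathcal{R}}$ is the set $\{wr_i, rd_i \mid i \in \mathbb{N}\}$. Events of $\mathcal{R}$ are written as $wr_i(x)$,
short for $(id, wr_i, x, \bot)$, and $rd_i(x)$, short for $(id, rd_i, \bot, x)$. Events with $wr_i$ are called write
events, and those with $rd_i$ are called read events. We use $Wr$ and $Rd$ to denote the set of all
write and read events, respectively.

$LTS_{\mathcal{R}}$ is defined as $(\mathbb{N}[\mathbb{N}], \lambda x.0, \Sigma_{\mathcal{R}}, \rightarrow_{\mathcal{R}})$, where $\rightarrow_{\mathcal{R}}$ is defined as:

- $q \xrightarrow{wr_i(x)}_{\mathcal{R}} q'$ iff $q' = q[i \mapsto x]$,
- $q \xrightarrow{rd_i(x)}_{\mathcal{R}} q'$ iff $q' = q$ and $q(i) = x$.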
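
As an illustration, the five transition systems of this section can be encoded as step functions that return the successor state, or `None` when the event is not enabled at the given state. This is only a minimal Python sketch: the tuple encoding of events, the use of `None` for NULL, and all function names are assumptions made for the example and are not part of the formal development.

```python
NULL = None  # assumed encoding of the distinguished NULL symbol


def pool_step(q, event):
    """Pool LTS: q is a frozenset of naturals."""
    op, x = event
    if op == "put":
        return q | {x}
    if op == "take" and x is NULL:
        return q if not q else None          # only enabled at the empty pool
    if op == "take":
        return q - {x} if x in q else None
    raise ValueError(f"unknown pool method {op!r}")


def pool_mem_step(q, event):
    """Pool with membership: adds the query mem(x, y) and never changes q on it."""
    if event[0] == "mem":
        _, x, y = event
        holds = (y == x and x in q) or (y == x + 1 and x not in q)
        return q if holds else None
    return pool_step(q, event)


def queue_step(q, event):
    """Queue LTS: q is a tuple with the head at index 0."""
    op, x = event
    if op == "enq":
        return q + (x,)
    if op == "deq" and x is NULL:
        return q if q == () else None
    if op == "deq":
        return q[1:] if q and q[0] == x else None
    raise ValueError(f"unknown queue method {op!r}")


def stack_step(q, event):
    """Stack LTS: q is a tuple with the top at the last index."""
    op, x = event
    if op == "push":
        return q + (x,)
    if op == "pop" and x is NULL:
        return q if q == () else None
    if op == "pop":
        return q[:-1] if q and q[-1] == x else None
    raise ValueError(f"unknown stack method {op!r}")


def register_step(q, event):
    """Register bank LTS: q is a dict; indices that are absent hold 0."""
    op, i, x = event
    if op == "wr":
        return {**q, i: x}
    if op == "rd":
        return q if q.get(i, 0) == x else None
    raise ValueError(f"unknown register method {op!r}")


def is_valid(step, initial, behavior):
    """A behavior is valid iff it is the trace of a run from the initial state."""
    state = initial
    for event in behavior:
        state = step(state, event)
        if state is None:
            return False
    return True
```

For instance, `is_valid(queue_step, (), [("enq", 1), ("enq", 2), ("deq", 1)])` holds, whereas replacing the last event by `("deq", 2)` yields an invalid behavior because 2 is not at the head of the queue.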
2.2 Histories

Each event $e = (uid, m, d_i, d_o)$ generates two actions: the invocation of $e$, written as $inv(e)$,
and the response of $e$, written as $res(e)$. We will also use $m_i^{uid}(d_i)$ and $m_r^{uid}(d_o)$ to denote
the invocation and response actions, respectively. When a particular method $m$ does not have
an input (resp. output) parameter, we will write $m_i^{uid}$ (resp. $m_r^{uid}$) for the corresponding
invocation (resp. response) action. We will also often omit the superscripts when they
are not important. For an event set $E$, let $E_i$ and $E_r$ denote the sets of all invocation and
response actions generated by $E$.

A $\mathcal{D}$-history is a sequence of invocation and response actions generated by $E_{\mathcal{D}}$. The unique
identifier of each event unambiguously pairs each invocation action to a unique response
action; in such a case, the actions are said to match. We will make use of this pairing without
explicitly referring to event identifiers when there is no confusion. Similarly, we will omit $\mathcal{D}$
whenever it is either inconsequential or clear from the text. A history $h$ is well-formed if every
response action appears after its matching invocation action in $h$. An event $e$ is completed
in $h$ if both its invocation and response actions appear in $h$. Formally, $e$ is completed
if $e_i, e_r \in Set(h)$. A history $h$ is complete if for all events $e$, $e_i \in Set(h)$ iff $e_r \in Set(h)$. In
what follows we will consider only well-formed and complete histories.

An event $e$ precedes another event $e'$ in $h$, written $e \prec_h e'$, if the response action of $e$
appears before the invocation action of $e'$; i.e. $e_r <_h e'_i$. A history is called sequential if
all invocation actions are immediately followed by their matching responses. Formally, the
(complete) history $h$ is sequential if it is of the form $e_{1,i} \cdot e_{1,r} \cdots e_{n,i} \cdot e_{n,r}$. We identify
sequential $\mathcal{D}$-histories with $\mathcal{D}$-behaviors by mapping each matching pair of invocation and
response actions to the event generating them. A sequential history $s$ is a linearization of a
history $h$ if $s$ is a permutation of $h$ such that $e \prec_h e'$ implies $e \prec_s e'$.

► Definition 1 (Linearizability). A $\mathcal{D}$-history $h$ is linearizable if there exists a linearization of
$h$ that is a valid $\mathcal{D}$-behavior. A set $H$ of histories is linearizable if every $h \in H$ is linearizable.

Let $T$ be the set of thread ids. A threaded $\mathcal{D}$-action is of the form $(t, a)$, where $t \in T$
and $a \in E_{\mathcal{D},i} \cup E_{\mathcal{D},r}$. For a threaded action $\mathbf{a} = (t, a)$, we use $tid$ and $act$ to retrieve the
first and second components, respectively; i.e. $tid(\mathbf{a}) = t$ and $act(\mathbf{a}) = a$. Similarly, $tid$
and $act$ are point-wise extended over sequences of threaded actions. For a sequence of
threaded actions $\mathbf{h}$ and a thread id $t \in T$, let $\mathbf{h}{\downarrow}t$ denote the subsequence obtained by
removing all threaded actions from $\mathbf{h}$ whose first component is not equal to $t$. A threaded
$\mathcal{D}$-history is a sequence $\mathbf{h}$ over threaded actions such that
- the sequence $act(\mathbf{h})$ is a complete $\mathcal{D}$-history,
- the sequence $act(\mathbf{h}{\downarrow}t)$ is sequential for any $t \in T$.

The second condition implies that at any point in a threaded history any thread $t \in T$
can have at most one unmatched action. For ease of presentation, we will use $h$ and $h(t)$
to denote $act(\mathbf{h})$ and $act(\mathbf{h}{\downarrow}t)$, respectively. We will extend the properties of histories to
threaded histories: a threaded history $\mathbf{h}$ is said to satisfy property $P$ if $h$ satisfies $P$.

► Definition 2 (Sequential Consistency). A threaded $\mathcal{D}$-history $\mathbf{h}$ is sequentially consistent if
there is a sequential threaded $\mathcal{D}$-history $\mathbf{s}$, with $s$ a valid $\mathcal{D}$-behavior, such that $\mathbf{s}$ is a
permutation of $\mathbf{h}$ and for all $t \in T$ we have $s(t) = h(t)$.

Intuitively, for a threaded history $h$ to be sequentially consistent (SC) only the relative
ordering per thread has to be respected. By the definition of threaded histories, if $(t, e)$
and $(t, e')$ are both in $h$, then either $e \prec_h e'$ or $e' \prec_h e$; hence linearizability must preserve the
relative ordering among events done by the same thread. In other words, linearizability is a
stronger condition than sequential consistency.

► Fact 3. Let $h$ be a threaded $\mathcal{D}$-history. If $h$ is linearizable, then it is sequentially consistent.
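
The gap between Definitions 1 and 2 can be explored on small queue histories by brute force. The following Python sketch enumerates all permutations of the events of a threaded history and looks for a valid sequential witness that respects either the full precedence order (linearizability) or only the per-thread order (sequential consistency). The encoding of actions as tuples `(thread, kind, eid, op, value)` and every function name are assumptions made for this illustration; the search is exponential and is meant for toy inputs only.

```python
from itertools import permutations

NULL = None  # assumed encoding of NULL


def queue_valid(behavior):
    """Replay (op, value) pairs against a sequential FIFO queue."""
    q = []
    for op, x in behavior:
        if op == "enq":
            q.append(x)
        elif x is NULL:
            if q:
                return False
        elif q and q[0] == x:
            q.pop(0)
        else:
            return False
    return True


def has_witness(history, per_thread_only):
    """history: list of (thread, kind, eid, op, value) with kind 'inv' or 'res'."""
    inv = {eid: i for i, (_, kind, eid, _, _) in enumerate(history) if kind == "inv"}
    res = {eid: i for i, (_, kind, eid, _, _) in enumerate(history) if kind == "res"}
    event = {eid: (t, op, x) for t, kind, eid, op, x in history if kind == "res"}
    # Pairs (a, b) such that a must appear before b in any admissible witness.
    order = [(a, b) for a in event for b in event
             if res[a] < inv[b]
             and (not per_thread_only or event[a][0] == event[b][0])]
    for perm in permutations(event):
        pos = {eid: i for i, eid in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in order) and \
                queue_valid([event[eid][1:] for eid in perm]):
            return True
    return False


def is_linearizable(history):
    return has_witness(history, per_thread_only=False)


def is_sequentially_consistent(history):
    return has_witness(history, per_thread_only=True)


# Thread t completes enq(1) before thread u even starts deq(NULL).
h = [("t", "inv", 1, "enq", 1), ("t", "res", 1, "enq", 1),
     ("u", "inv", 2, "deq", NULL), ("u", "res", 2, "deq", NULL)]
```

On the history `h`, `is_linearizable(h)` is false because the only linearization is `enq(1) · deq(NULL)`, while `is_sequentially_consistent(h)` is true because the witness `deq(NULL) · enq(1)` preserves each thread's local order.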

As far as the implementation of data structures is concerned, we will not specify a 
particular programming language. We will assume that each method in the method alphabet 
has an accompanying procedure. An execution trace is a sequence of instruction labels 
coupled with thread identifiers executing the instruction. For instance, (t : i) denotes the 
execution of instruction with the unique label i by thread t. An instruction label is the entry 
point of method $m$, written $enter(m)$, if it is the label of the first instruction of $m$. Similarly,
an instruction label is an exit point of $m$, written $exit(m)$, if it is the label of an instruction
that completes the execution of $m$. Each execution trace $r$ induces a history $h(r)$ which is
obtained by replacing each $(t : enter(m))$ with $(t, m_i^{uid}(d_i))$, each $(t : exit(m))$ with $(t, m_r^{uid}(d_o))$,
and removing the remaining (intermediate) symbols. We assume that states of an execution
trace contain enough information to deduce the values of $d_i$ and $d_o$ associated with each
entry and exit point. An execution trace is complete if its induced history is complete. For an
execution trace $r$ and some $t \in T$, let $r{\downarrow}t$ denote the execution trace obtained by retaining
only the symbols due to $t$ (symbols of the form $(t : i)$). An implementation is identified with
the set of execution traces it generates. When clear from the context we will refer to the 
induced history of an execution trace as a history of the implementation. 

3 Properties of Data Structures 

Our main result is that sequential consistency is too weak because the class of SC implementations
includes bad ones. In order to generalize our result, we abstract away irrelevant
specifics of data structures and extract what seems to be the essential property that causes
SC implementations to misbehave. We identify two properties: composability and robustness.
A data structure is composable if any two valid behaviors can be interleaved in such a way
that the result is also a valid behavior. Robustness means that the data structure has events 
that are state independent. In this section, we formalize these notions. 

► Definition 4 (Composable). Let $e$ and $f$ be two valid $\mathcal{D}$-behaviors. They are called
composable if there exist two partitionings $e = e_1 \cdots e_k$ and $f = f_1 \cdots f_k$ such that the
behavior $e_1 f_1 \cdots e_k f_k$ is a valid $\mathcal{D}$-behavior. The data structure $\mathcal{D}$ is called composable if any
pair of valid $\mathcal{D}$-behaviors is composable.

Informally, two valid $\mathcal{D}$-behaviors are composable if it is possible to interleave them to obtain
another valid $\mathcal{D}$-behavior. We now proceed with a series of results, establishing composability
of the data structures we consider in this paper.

► Lemma 5 (Pool and Composability). The pool data structure $\mathcal{P}$ is composable.

Proof. Let $e$ and $f$ be two valid $\mathcal{P}$-behaviors. Let $e_1$ and $f_1$ be the maximal prefixes of $e$ and
$f$, respectively, such that $last(e_1) = take(NULL)$ and $last(f_1) = take(NULL)$ (a prefix is taken to be
empty if the behavior contains no $take(NULL)$). Let $e_2$ and $f_2$ be
the remaining suffixes of $e$ and $f$. That is, $e = e_1 e_2$ and $f = f_1 f_2$. We claim that $g = e_1 f_1 e_2 f_2$
is a valid $\mathcal{P}$-behavior. This is equivalent to showing that there is a run of $LTS_{\mathcal{P}}$ whose trace
is $g$. By the validity of $e$, we know that there is a run $r_1$ of $LTS_{\mathcal{P}}$ with trace $e_1$ because
$Tr(LTS_{\mathcal{P}})$ is prefix-closed. Furthermore, by the assumption that $last(e_1) = take(NULL)$, that
run ends at the state $\emptyset$. Similarly, there is a run $r_2$ with trace $f_1$, which also ends at $\emptyset$. Since $r_1$ ends
at $\emptyset$, $r_1 r_2$ is also a run of $LTS_{\mathcal{P}}$ whose trace is $e_1 f_1$. Since the trace $e_2$ belongs to a run $r_3$
(implying that $r_1 r_3$ is the run with trace $e$) which starts at $\emptyset$, $r_1 r_2 r_3$ is a run of $LTS_{\mathcal{P}}$. Finally, we have to
extend $r_1 r_2 r_3$ with a run $r_4$ whose trace is $f_2$. We proceed by induction on the length of
$f_2$. The base case, $f_2 = \varepsilon$, is trivial. Assume that the next event is $put(x)$. By the definition
of $LTS_{\mathcal{P}}$, such a transition is enabled at all states. Assume that the next event is $take(x)$
(note that by construction, $f_2$ does not contain a transition with label $take(NULL)$). Because
the run with trace $f_2$ starts at the state $\emptyset$, there must be an event $put(x)$ earlier in $f_2$, and
no event between the last such $put(x)$ and this $take(x)$ is equal to $take(x)$. In other
words, we must have $x \in q$, where $q$ is the current state, which means that it is possible to
extend the run with $take(x)$. ◄
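
The construction used in this proof can be tried out on concrete behaviors. The Python sketch below splits each behavior after its last `take(NULL)` and assembles $e_1 f_1 e_2 f_2$; the tuple encoding of events and the helper names are assumptions made for the example.

```python
NULL = None  # assumed encoding of NULL


def pool_valid(behavior):
    """Replay (op, value) pairs against a sequential pool starting empty."""
    q = set()
    for op, x in behavior:
        if op == "put":
            q.add(x)
        elif x is NULL:
            if q:
                return False
        elif x in q:
            q.remove(x)
        else:
            return False
    return True


def split_at_last_null_take(behavior):
    """Return (maximal prefix ending with take(NULL), remaining suffix)."""
    cut = max((i + 1 for i, ev in enumerate(behavior) if ev == ("take", NULL)),
              default=0)
    return behavior[:cut], behavior[cut:]


def compose_pool(e, f):
    """Build the interleaving e1 f1 e2 f2 from the proof of the pool lemma."""
    e1, e2 = split_at_last_null_take(e)
    f1, f2 = split_at_last_null_take(f)
    return e1 + f1 + e2 + f2


e = [("put", 1), ("take", 1), ("take", NULL), ("put", 2)]
f = [("put", 3), ("take", 3), ("take", NULL), ("put", 4), ("take", 4)]
g = compose_pool(e, f)
```

Here the plain concatenation `e + f` is not a valid behavior, since the `take(NULL)` of `f` would run while 2 is still in the pool, whereas `g` is valid because both `take(NULL)` events are moved ahead of the two suffixes.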

► Lemma 6 (Stack and Composability). The stack data structure $\mathcal{S}$ is composable.

Proof (Sketch). The construction is similar to the one given in the proof for $\mathcal{P}$. For any two
valid $\mathcal{S}$-behaviors $e$ and $f$, we take the maximal prefixes $e_1$ and $f_1$, both of which end with
$pop(NULL)$. Then the composed behavior $e_1 f_1 e_2 f_2$ is also a valid $\mathcal{S}$-behavior, where $e_2$ and
$f_2$ are the remaining suffixes of $e$ and $f$. Intuitively, the constructed behavior is valid because
neither $e_2$ nor $f_2$ reaches beyond what they have pushed onto the stack (no appearance of
$pop(NULL)$ in either of the two). ◄

► Lemma 7 (Queue and Composability). The queue data structure $\mathcal{Q}$ is composable.

Proof (Sketch). Unlike the previous two cases, the construction for queue is more involved.
We will need to partition a valid $\mathcal{Q}$-behavior according to the relative ordering between the
enqueue events of $Enq$ and the dequeue events of $Deq$. For any $enq(x)$, let $deq(x)$ be called
its observer. Given a valid $\mathcal{Q}$-behavior $e$, we define $Era_j(e)$ inductively as follows:
- $Era_0(e) = \varepsilon$.
- $Era_{j+1}(e)$ is the maximum segment that begins with an enqueue event whose observer is
  in $Era_j(e)$ and extends until the first element that belongs to $Era_j(e)$.

To illustrate the definition, consider the valid $\mathcal{Q}$-behavior:

$e = enq(1) \cdot enq(2) \cdot deq(1) \cdot enq(3) \cdot deq(2) \cdot enq(4) \cdot enq(5) \cdot deq(3) \cdot deq(4)$

Then, we have

$Era_1(e) = enq(5) \cdot deq(3) \cdot deq(4)$
$Era_2(e) = enq(3) \cdot deq(2) \cdot enq(4)$
$Era_3(e) = enq(2) \cdot deq(1)$
$Era_4(e) = enq(1)$

Observe that by construction all enqueue events of $Era_j(e)$ for $j > 1$ are observed by the
dequeue events of $Era_{j-1}(e)$. By convention, for any $n > k$ such that $Era_k(e) \cdots Era_1(e) = e$,
we set $Era_n(e) = \varepsilon$.

Now let $e$ and $f$ be two valid $\mathcal{Q}$-behaviors. Let $e_l$ be the maximal prefix of $e$ such that
the run with trace $e_l$, which necessarily exists, ends at state $\varepsilon$ (the initial state of $LTS_{\mathcal{Q}}$).
Let $e_r$ be the remaining suffix of $e$; i.e. $e = e_l e_r$. Let $f_l$ and $f_r$ be defined similarly for $f$.
Now construct the Era sequences for $e_r$ and $f_r$. Let $j_e$ be the maximal index of a non-empty
era sequence of $e$, and $j_f$ be the corresponding index for $f$. Without loss of generality, assume that $j_e \ge j_f$.
Then, the interleaving

$e_l \cdot f_l \cdot Era_{j_e}(e) \cdot Era_{j_e}(f) \cdots Era_1(e) \cdot Era_1(f)$

is a valid $\mathcal{Q}$-behavior. To illustrate the construction, consider another valid $\mathcal{Q}$-behavior

$f = enq(6) \cdot deq(6) \cdot enq(7) \cdot enq(8) \cdot deq(7) \cdot enq(9) \cdot enq(10) \cdot deq(8)$

Then, $f_l$ is $enq(6) \cdot deq(6)$ and the era sequences for the suffix $f_r$ are

$Era_1(f) = enq(9) \cdot enq(10) \cdot deq(8)$
$Era_2(f) = enq(8) \cdot deq(7)$
$Era_3(f) = enq(7)$

Finally, the interleaving for $e$ and $f$ is given as:

$enq(6) \cdot deq(6) \cdot enq(1) \cdot enq(2) \cdot deq(1) \cdot enq(7) \cdot enq(3) \cdot deq(2) \cdot$
$enq(4) \cdot enq(8) \cdot deq(7) \cdot enq(5) \cdot deq(3) \cdot deq(4) \cdot enq(9) \cdot enq(10) \cdot deq(8)$

which is a valid $\mathcal{Q}$-behavior. Intuitively, the construction works as each era sequence of
one behavior (say $e$) with index $j$ removes all the elements inserted by the most recent
era sequence of the same behavior with index $j + 1$, thereby making the insertions of this
behavior ($e$) invisible to the other behavior ($f$) and vice versa. ◄
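
The era decomposition and the resulting interleaving can be reproduced mechanically for the two example behaviors. The Python sketch below follows one reading of the inductive definition (the first era starts at the first enqueue that is never dequeued, and each further era starts at the first enqueue observed inside the previous one). It assumes valid input behaviors with pairwise distinct values; the function names and the tuple encoding are hypothetical, and the code is checked only against the example of this proof sketch, not offered as a general proof of the lemma.

```python
NULL = None  # assumed encoding of NULL


def queue_valid(behavior):
    q = []
    for op, x in behavior:
        if op == "enq":
            q.append(x)
        elif x is NULL:
            if q:
                return False
        elif q and q[0] == x:
            q.pop(0)
        else:
            return False
    return True


def split_at_last_empty(behavior):
    """Maximal prefix whose run ends at the empty queue, and the remaining suffix."""
    q, cut = [], 0
    for i, (op, x) in enumerate(behavior):
        if op == "enq":
            q.append(x)
        elif x is not NULL:
            q.pop(0)
        if not q:
            cut = i + 1
    return behavior[:cut], behavior[cut:]


def eras(suffix):
    """Return [Era_1, Era_2, ...] for a suffix as produced by split_at_last_empty."""
    if not suffix:
        return []
    deq_pos = {x: i for i, (op, x) in enumerate(suffix) if op == "deq"}
    # Era_1 starts at the first enqueue that has no observer.
    start = next(i for i, (op, x) in enumerate(suffix)
                 if op == "enq" and x not in deq_pos)
    result, end = [], len(suffix)
    while True:
        result.append(suffix[start:end])
        if start == 0:
            return result
        # The next era starts at the first enqueue observed in the era just emitted.
        new_start = next(i for i, (op, x) in enumerate(suffix[:start])
                         if op == "enq" and start <= deq_pos.get(x, -1) < end)
        start, end = new_start, start


def compose_queue(e, f):
    """e_l f_l Era_j(e) Era_j(f) ... Era_1(e) Era_1(f), skipping empty eras."""
    e_l, e_r = split_at_last_empty(e)
    f_l, f_r = split_at_last_empty(f)
    eras_e, eras_f = eras(e_r), eras(f_r)
    out = e_l + f_l
    for j in range(max(len(eras_e), len(eras_f)), 0, -1):
        for era_list in (eras_e, eras_f):
            if j <= len(era_list):
                out = out + era_list[j - 1]
    return out


E, D = "enq", "deq"
e = [(E, 1), (E, 2), (D, 1), (E, 3), (D, 2), (E, 4), (E, 5), (D, 3), (D, 4)]
f = [(E, 6), (D, 6), (E, 7), (E, 8), (D, 7), (E, 9), (E, 10), (D, 8)]
g = compose_queue(e, f)
```

For these inputs `eras(e)` reproduces the four era sequences listed for $e$, and `g` coincides with the interleaving displayed above, which `queue_valid` accepts.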

The other two remaining data structures, $\mathcal{P}'$ and $\mathcal{R}$, are not composable. For the pool with
membership data structure $\mathcal{P}'$, the following two valid $\mathcal{P}'$-behaviors have no interleaving
that is a valid $\mathcal{P}'$-behavior:

$e = put(1) \cdot mem(2, 3)$, $f = put(2) \cdot mem(1, 2)$

since $mem(2,3)$ has to come before $put(2)$ and $mem(1,2)$ has to come before $put(1)$; these two
conditions cannot be satisfied simultaneously.

Similarly, for the register data structure $\mathcal{R}$, the following valid $\mathcal{R}$-behaviors are not
composable:

$e = wr(1, 1) \cdot rd(2, 0)$, $f = wr(2, 1) \cdot rd(1, 0)$
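
Both counterexamples are small enough to be checked exhaustively. Since the partitionings in the definition of composability may contain empty segments, every order-preserving merge of the two behaviors is a candidate, so the Python sketch below simply enumerates all merges and tests each for validity. The event encoding and the function names are assumptions made for this illustration.

```python
from itertools import combinations


def interleavings(e, f):
    """All merges of e and f that keep the internal order of each behavior."""
    n, m = len(e), len(f)
    for slots in combinations(range(n + m), n):
        chosen = set(slots)
        e_it, f_it = iter(e), iter(f)
        yield [next(e_it) if i in chosen else next(f_it) for i in range(n + m)]


def pool_mem_valid(behavior):
    """Pool with membership: put(x), take(x) with None as NULL, and mem(x, y)."""
    q = set()
    for ev in behavior:
        if ev[0] == "put":
            q.add(ev[1])
        elif ev[0] == "take":
            x = ev[1]
            if x is None:
                if q:
                    return False
            elif x in q:
                q.remove(x)
            else:
                return False
        else:  # ("mem", x, y)
            _, x, y = ev
            if not ((y == x and x in q) or (y == x + 1 and x not in q)):
                return False
    return True


def register_valid(behavior):
    """Register bank: ("wr", i, x) and ("rd", i, x); every register starts at 0."""
    q = {}
    for op, i, x in behavior:
        if op == "wr":
            q[i] = x
        elif q.get(i, 0) != x:
            return False
    return True


def composable(e, f, valid):
    return any(valid(g) for g in interleavings(e, f))


pool_e = [("put", 1), ("mem", 2, 3)]
pool_f = [("put", 2), ("mem", 1, 2)]
reg_e = [("wr", 1, 1), ("rd", 2, 0)]
reg_f = [("wr", 2, 1), ("rd", 1, 0)]
```

With these definitions, `composable(pool_e, pool_f, pool_mem_valid)` and `composable(reg_e, reg_f, register_valid)` are both false, even though each of the four behaviors is valid on its own.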

The next property, robustness, is satisfied by all five of the data structures we consider and 
to the best of our knowledge all data structures are robust. 

► Definition 8 (Robustness). Let $e$ be a valid $\mathcal{D}$-behavior. It is robust if for any state
$q$ of $LTS_{\mathcal{D}}$, there is a run that starts at $q$ with trace $e$. A data structure is robust if it has
at least one robust sequence.

In general, any data structure which contains a total event (e.g. $enq(x)$ of $\mathcal{Q}$ or $push(x)$ of
$\mathcal{S}$) is robust.

► Lemma 9. All of $\mathcal{P}$, $\mathcal{P}'$, $\mathcal{Q}$, $\mathcal{S}$, $\mathcal{R}$ are robust.

Proof. The event $put(x)$ is enabled at every state of $LTS_{\mathcal{P}}$ and $LTS_{\mathcal{P}'}$. The event $enq(x)$
is enabled at every state of $LTS_{\mathcal{Q}}$. The event $push(x)$ is enabled at every state of $LTS_{\mathcal{S}}$. The
event $wr(x, y)$ is enabled at every state of $LTS_{\mathcal{R}}$. ◄

Robust sequences can contain arbitrary events or be restricted to a subset of events, depending 
on the data structure. 

► Lemma 10. A robust sequence of $\mathcal{P}$ and $\mathcal{P}'$ cannot contain $take(NULL)$; for $\mathcal{Q}$, it cannot
contain $deq(NULL)$; for $\mathcal{S}$, it cannot contain $pop(NULL)$. There is no restriction on robust
sequences for $\mathcal{R}$.
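
Robustness quantifies over all states, so it cannot be established by testing; a finite sample of states can, however, refute it. The Python sketch below runs a candidate queue sequence from a handful of states. The sample, the tuple encoding and the function names are assumptions made for this illustration.

```python
NULL = None  # assumed encoding of NULL


def queue_step(q, event):
    """Queue LTS on tuples; returns None when the event is not enabled."""
    op, x = event
    if op == "enq":
        return q + (x,)
    if x is NULL:
        return q if q == () else None
    return q[1:] if q and q[0] == x else None


def runs_from(state, behavior):
    """True iff there is a run with trace `behavior` starting at `state`."""
    for event in behavior:
        state = queue_step(state, event)
        if state is None:
            return False
    return True


def robust_on_sample(behavior, sample_states):
    """Necessary (not sufficient) check: the behavior must run from every sample state."""
    return all(runs_from(q, behavior) for q in sample_states)


SAMPLE = [(), (1,), (1, 2), (3, 1, 2)]
```

On this sample, `[("enq", 7)]` passes, `[("deq", NULL)]` fails at every non-empty state, and `[("enq", 7), ("deq", 7)]` also fails because 7 is not at the head once the queue already holds other elements.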


A. Sezgin 




4 SC is too Weak 

In this section, we present the bad implementations that SC seems to allow. These bad
implementations come in two variants: conditional and unconditional non-synchronization.
We show that all robust data structures allow conditional non-synchronization. Unconditional
non-synchronization, arguably the worse of the two, is allowed by composable data structures
($\mathcal{P}$, $\mathcal{Q}$, $\mathcal{S}$).

Let us call a label $l$ enabled at state $q$ if there exists a state $q'$ such that $q \xrightarrow{l} q'$.

► Definition 11 (Initialized). Let $(Q, q_0, L, \rightarrow)$ be an LTS. A state $q \in Q$ is subsumed by
another state $q' \in Q$ if for all $l \in L$, $l$ is enabled at $q$ implies $l$ is enabled at $q'$. The LTS is
called initialized if its initial state $q_0$ is not subsumed by any other state.

We call an LTS $(Q, q_0, L, \rightarrow)$ non-trivial if there is at least one state $q \neq q_0$ and one label
$l \in L$ such that $q_0 \xrightarrow{l} q$. A data structure $\mathcal{D}$ is non-trivial if $LTS_{\mathcal{D}}$ is non-trivial.

► Lemma 12. The LTSs corresponding to $\mathcal{P}$, $\mathcal{P}'$, $\mathcal{Q}$, $\mathcal{S}$, $\mathcal{R}$ are initialized.

Proof. In all cases, there are transitions which are only enabled at the initial state. The
transitions for all except $LTS_{\mathcal{R}}$ are $take(NULL)$ in $LTS_{\mathcal{P}}$ and $LTS_{\mathcal{P}'}$; $deq(NULL)$ in $LTS_{\mathcal{Q}}$;
$pop(NULL)$ in $LTS_{\mathcal{S}}$. For $LTS_{\mathcal{R}}$, let $q' \neq q_0$ be some state. By definition, there must be at
least one $x \in \mathbb{N}$ such that $q'(x) \neq 0$ since otherwise $q'$ and $q_0 = \lambda x.0$ are identical. Then,
$rd(x, 0)$ is enabled at $q_0$ but not at $q'$. ◄

For each data structure $\mathcal{D}$, we distinguish an implementation $Imp_{iso}(\mathcal{D})$, called the isolated
implementation of $\mathcal{D}$, whose induced histories are thread-locally valid $\mathcal{D}$-behaviors. Formally,
for any execution trace $r$ of $Imp_{iso}(\mathcal{D})$ and for any $t \in T$, the induced history of $r{\downarrow}t$ is
a valid $\mathcal{D}$-behavior. Intuitively, isolated implementations are those which do not need any
communication between threads. For most data structures, such implementations are not
desirable and, as the following result shows, are ruled out by linearizability.

► Lemma 13. If $\mathcal{D}$ is initialized and non-trivial, then $Imp_{iso}(\mathcal{D})$ is not linearizable.

Proof. Let $LTS_{\mathcal{D}}$ be the tuple $(Q, q_0, E, \rightarrow)$. Because $\mathcal{D}$ is non-trivial, there is a transition
$q_0 \xrightarrow{e} q$ for some $e = (m, d_i, d_o) \in E$ and $q \neq q_0$. Because $\mathcal{D}$ is initialized, there is an event
$e' \in E$ such that $e' = (m', d'_i, d'_o)$ is enabled at $q_0$ and not at $q$. Then consider the history
$h$ for $t, t' \in T$:

$h = (t, m_i(d_i)) \cdot (t, m_r(d_o)) \cdot (t', m'_i(d'_i)) \cdot (t', m'_r(d'_o))$

The only linearization for this history is $(m, d_i, d_o) \cdot (m', d'_i, d'_o)$. Since this is not a valid
$\mathcal{D}$-behavior, $h$ is not linearizable. However, because both $(m, d_i, d_o)$ and $(m', d'_i, d'_o)$ are
individually valid $\mathcal{D}$-behaviors, $h$ is induced by some execution trace of $Imp_{iso}(\mathcal{D})$, implying
that the latter is not linearizable. ◄

The following result immediately follows from Lemmas 12 and 13.

► Corollary 14. The isolated implementations of $\mathcal{P}$, $\mathcal{P}'$, $\mathcal{Q}$, $\mathcal{S}$ and $\mathcal{R}$ are not linearizable.

The previous result shows that the definition of linearizability is strong enough to leave out 
these pathological implementations. As we show next, sequential consistency is weak enough 
to allow for isolated implementations of some data structures. 
SC Data Structures


► Theorem 15 (SC and Isolated Implementations). If $\mathcal{D}$ is composable, then $Imp_{iso}(\mathcal{D})$ is
sequentially consistent.

Proof. Let $r$ be an execution trace of $Imp_{iso}(\mathcal{D})$ and let $h$ be the induced threaded history.
We do induction on the number of threads that execute at least one method during $r$. If $h$ is
due to a single thread, then it is a valid $\mathcal{D}$-behavior and hence SC. Assume that if $h$ has at most
$k$ different threads, it is SC. Now let $h$ have $k + 1$ different threads and let
$t \neq t'$ be the identifiers of two of those. By the definition of $Imp_{iso}(\mathcal{D})$, both $h(t)$ and $h(t')$
are valid $\mathcal{D}$-behaviors. By composability, there is an interleaving $h'$ of $h(t)$ and $h(t')$ which
is also a valid $\mathcal{D}$-behavior. Let $u \in T$ be a thread identifier that does not appear in $h$ and let
$\mathbf{h}'$ denote the threaded history obtained by coupling each symbol of $h'$ with $u$. That is, $\mathbf{h}'$
is the sequence $(u, h'(i))_{i \in [1, len(h')]}$. Let $g$ be the threaded history which is constructed by
first projecting out from $h$ all symbols $s$ with $tid(s) = t$ or $tid(s) = t'$, and then extending it
with $\mathbf{h}'$. By construction, $g$ has $k$ different threads and each $g(t)$ is a valid $\mathcal{D}$-behavior. By
definition, there is an execution trace $r'$ which induces $g$. By the inductive hypothesis, there is a
sequential threaded valid $\mathcal{D}$-history $s$ such that for all $t \in T$ we have $s(t) = g(t)$. Finally,
because $h'$ was an interleaving of $h(t)$ and $h(t')$, replacing $u$ in $s$ with the original $t$ or $t'$
identifiers yields the valid sequential threaded $\mathcal{D}$-history corresponding to $h$. ◄

This implies that the definition of sequential consistency is not strong enough for composable 
data structures. 

► Corollary 16. The isolated implementations of $\mathcal{P}$, $\mathcal{Q}$ and $\mathcal{S}$ are sequentially consistent.
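
To make the consequence concrete, the following Python sketch shows what an isolated queue might look like: every thread operates on a private FIFO kept in thread-local storage, so no thread ever observes another thread's enqueues. The class `IsolatedQueue` and the `demo` driver are hypothetical names introduced for this illustration; they are not the implementation template referred to elsewhere in the paper.

```python
import threading
from collections import deque


class IsolatedQueue:
    """Each thread sees only its own private FIFO; no shared state is touched."""

    def __init__(self):
        self._local = threading.local()

    def _items(self):
        # Lazily create the calling thread's private queue.
        if not hasattr(self._local, "items"):
            self._local.items = deque()
        return self._local.items

    def enq(self, x):
        self._items().append(x)

    def deq(self):
        items = self._items()
        return items.popleft() if items else None  # None plays the role of NULL


def demo(n_tasks=3):
    queue = IsolatedQueue()
    results = {}

    def producer():
        for task in range(n_tasks):
            queue.enq(task)
        results["left_with_producer"] = len(queue._items())

    def consumer():
        results["consumed"] = [queue.deq() for _ in range(n_tasks)]

    # The producer finishes completely before the consumer even starts.
    p = threading.Thread(target=producer)
    p.start()
    p.join()
    c = threading.Thread(target=consumer)
    c.start()
    c.join()
    return results
```

Calling `demo()` leaves all tasks with the producer while the consumer only ever dequeues NULL. The resulting history is still sequentially consistent, since a witness may place the consumer's `deq(NULL)` events before the producer's enqueues, but it is not linearizable because the enqueues completed before the dequeues were invoked.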

An execution trace $r$ of a $\mathcal{D}$ implementation is called $t$-singular if $h(r){\uparrow}t$ is linearizable
and $h(r){\downarrow}t$ is a valid $\mathcal{D}$-behavior, where $h(r){\uparrow}t$ denotes $h(r)$ with all actions of thread $t$
removed. Let $Imp_{sing}(\mathcal{D})$, the singular implementation of $\mathcal{D}$, denote
the union of all linearizable execution traces and all $t$-singular execution traces. We next
show that singular implementations of robust data structures are non-linearizable.

► Lemma 17. If $\mathcal{D}$ is robust, initialized and non-trivial, then $Imp_{sing}(\mathcal{D})$ is not linearizable.

Proof. Let $LTS_{\mathcal{D}}$ be the tuple $(Q, q_0, E, \rightarrow)$. Let $e = (e(i))_{i \in [1,n]}$ be a robust sequence. Let
$q_0 \xrightarrow{e(1)} q_1 \cdots \xrightarrow{e(n)} q_n$ be the run whose trace is $e$. If $q_n \neq q_0$, because $\mathcal{D}$ is initialized, there
must be some event $e' \in E$ enabled at $q_0$ but not at $q_n$. Then, similar to the proof of
Lemma 13, the threaded history in which thread $t \in T$ runs all actions associated with $e$,
followed by another thread $u \in T$ running the two actions generated by the event $e'$, is in
$Imp_{sing}(\mathcal{D})$ because $e'$ represents a valid $\mathcal{D}$-behavior and $e$ represents a robust sequence. If
$q_n = q_0$, then there must be a state $q' \neq q_0$ and a label $e'$ such that $q_0 \xrightarrow{e'} q'$ holds. The rest
of the argument is the same as the previous one. ◄

The following is immediate from Lemma 9.

► Corollary 18. The singular implementations of 𝒫, T′, 𝒬, 𝒮 and ℛ are not linearizable.

Intuitively, in singular implementations the behavior of some thread t can be hidden from 
the rest of the system as long as the sequence generated by t remains robust. This, although 
arguably not as bad as being isolated, is still an undesirable feature and linearizability forbids 
it. We now show that singular implementations are sequentially consistent. 

► Theorem 19 (SC and Singular Implementations). For any data structure 𝒟, Imp_sing(𝒟) is
sequentially consistent.

Proof. Let τ be an execution trace of Imp_sing(𝒟) and let h = h(τ) be the induced threaded history.
If h is linearizable, then by Fact 3 it is also sequentially consistent. If h is not linearizable,
then there must be some thread t ∈ T such that h(τ)↓t is robust. Furthermore, we know
that h(τ)↑t is linearizable. Assume that s is the linearization of h(τ)↑t. Then the sequence
s' formed by appending h(τ)↓t to s is a valid 𝒟-behavior. Since for all u ∈ T we have
s'↓u = h(τ)↓u, we conclude that h is sequentially consistent. ◄
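
The construction in this proof can be illustrated on a small queue history. The following Python sketch is only an illustration under stated assumptions: events are modeled as hypothetical tuples (thread, op, arg, ret), h↑t is read as "h with the events of t removed", the run is non-overlapping so that h↑t is its own linearization, and enqueue-only sequences stand in for a robust sequence of the singular thread. None of these names come from the paper.

```python
from collections import deque

def project(history, t):
    """h ↓ t: the events of thread t, in order."""
    return [e for e in history if e[0] == t]

def without(history, t):
    """h ↑ t, read here as: every event except those of thread t."""
    return [e for e in history if e[0] != t]

def is_valid_queue(seq):
    """Replay (thread, op, arg, ret) events on a FIFO queue; deq on empty returns None."""
    q = deque()
    for _, op, arg, ret in seq:
        if op == "enq":
            q.append(arg)
        elif (q.popleft() if q else None) != ret:
            return False
    return True

# Real-time order of a non-overlapping run; thread "t" is the singular thread.
h = [
    ("t", "enq", 1, None),
    ("t", "enq", 2, None),
    ("u", "deq", None, None),   # u sees an empty queue: t's enqueues are hidden
    ("v", "enq", 9, None),
    ("u", "deq", None, 9),
]

s = without(h, "t")              # linearization of the other threads
s_prime = s + project(h, "t")    # append the singular thread's sequence

threads = {e[0] for e in h}
same_projections = all(project(s_prime, x) == project(h, x) for x in threads)
```

Here h read in real-time order is not a valid queue behavior, so it is not linearizable, while s_prime is valid and agrees with h on every per-thread projection, which is exactly what sequential consistency asks for.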

We end this section by giving templates for isolated and singular implementations. Assume
that we already have a sequential implementation of any data structure and use the notation
Obj_𝒟 to denote the class implementing the methods of 𝒟. For instance, for the 𝒬 data
structure, Obj_𝒬 implements the required methods, which are called by appending the method
to an object O of type Obj_𝒬, as in O.enq(x). In our programs, we assume that each thread
t ∈ T has its thread-local copy of type Obj_𝒟 and use the notation Obj[t] to denote the object
exclusively used by t. Then, in an isolated implementation, each method m ∈ Σ_𝒟 has the
following template, where self evaluates to t whenever the method is run by thread t:

m(d_i) { d_o = Obj[self].m(d_i); return d_o; }
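
As a concrete reading of the isolated template, the following Python sketch instantiates it for a FIFO queue. It is a minimal illustration, not the paper's code: the class names SeqQueue and IsolatedQueue and the convention that deq returns None on an empty queue are assumptions made here, and Python's threading.local plays the role of Obj[self].

```python
import threading
from collections import deque

class SeqQueue:
    """Sequential FIFO queue; deq returns None when the queue is empty."""
    def __init__(self):
        self._items = deque()

    def enq(self, x):
        self._items.append(x)

    def deq(self):
        return self._items.popleft() if self._items else None

class IsolatedQueue:
    """Every thread works only on its own private SeqQueue (Obj[self])."""
    def __init__(self):
        self._local = threading.local()

    def _obj(self):
        # Lazily create the calling thread's private copy.
        if not hasattr(self._local, "obj"):
            self._local.obj = SeqQueue()
        return self._local.obj

    def enq(self, x):
        return self._obj().enq(x)

    def deq(self):
        return self._obj().deq()

def demo():
    q = IsolatedQueue()
    results = {}

    def producer():
        q.enq(1)
        results["producer"] = q.deq()

    def consumer():
        results["consumer"] = q.deq()

    # The consumer starts only after the producer has finished.
    for body in (producer, consumer):
        th = threading.Thread(target=body)
        th.start()
        th.join()
    return results
```

In demo(), the consumer runs strictly after the producer and still observes an empty queue. No method ever synchronizes, yet each thread on its own sees a valid queue behavior.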

As for singular implementations, we use the template given in Fig. 1. Intuitively, there is
one non-deterministically assigned thread identifier j, which each thread compares against
its own. As in the isolated implementation template explained above, there is a local object
for each thread. Additionally, there is another object, O, visible to all threads. If a thread
with identifier t ≠ j invokes a method m with input d_i, then it applies m(d_i) on O atomically
(e.g. by performing the operation only after acquiring a global lock, which it releases upon
completion). Otherwise, if t = j, then the thread checks whether the sequence it has locally
performed so far (kept in the thread-local variable lseq) is robust. If not, it proceeds like
the other threads, atomically applying the method. If the sequence so far has been robust,
the result of applying m(d_i) to it is checked again for robustness. If appending m(d_i, d_o) to
lseq leaves it robust, lseq is updated and d_o, which is the result of applying m(d_i) to 𝒟 after
lseq, is returned. Otherwise, the sequence up to now is atomically applied to O and t becomes
fully synchronized.

5 From SC to Linearizability - Forced Synchronization 

The previous section showed that the definition of sequential consistency is too weak. If
it were taken as is as the correctness criterion, certain broken implementations, such
as the isolated or singular implementations, would be correct. We also know that the same
implementations are not linearizable. On the one hand, the synchronization required for
achieving linearizable data structures is also the culprit for non-scalable implementations. On
the other hand, the complete disregard for synchronization allowed by sequential consistency
leaves us with pathological implementations. In this section, we propose a way to quantitatively
bridge the gap from sequential consistency to linearizability.


Event d_o = m(d_i)
  if self = j then
    d_o ← Obj[self].m(d_i);
    newseq ← lseq · m(d_i, d_o);
    if notRobust(lseq) then
      atomic <commit(lseq)>;
      atomic <d_o ← O.m(d_i)>;
      lseq ← ε;
    else
      lseq ← newseq;
  else
    atomic <d_o ← O.m(d_i)>;
  return d_o;

Figure 1 The template for m(d_i, d_o) in Imp_sing.
Our idea is to limit the number of consecutive (total) events that a thread can execute
before being forced to synchronize. Let h be a threaded 𝒟-history. For any t ∈ T, we call
an event e the i-th event of t if h(t)(i) = e. For any SC threaded 𝒟-history h, let us call a
sequential threaded 𝒟-history s a serialization of h if for all t ∈ T, s(t) = h(t).

► Definition 20 (k-serial). Let a threaded 𝒟-history h be SC. Then, h is k-serial if there
exists a serialization s of h such that for any t ∈ T and e, e' ∈ h, whenever e' is the i-th event
of t and e' ≺_h e, then for all j ≤ i − k we have h(t)(j) <_s e. An implementation is k-SC if all
its traces are k-serial.
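
To make the quantifiers of the definition concrete, the following Python sketch checks whether a given serialization s witnesses k-seriality of a history h. It is a hedged illustration, not part of the paper: the Event record with invocation and response times is a hypothetical encoding, e' ≺_h e is taken to mean that e' responds before e is invoked, the bound is read as j ≤ i − k (the reading under which k = 0 coincides with linearizability), and the validity of s as a 𝒟-behavior is assumed rather than checked.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    thread: str
    label: str
    inv: int   # invocation time (assumed encoding)
    res: int   # response time (assumed encoding)

def by_thread(events):
    """Group events per thread, preserving the given order."""
    out = {}
    for e in events:
        out.setdefault(e.thread, []).append(e)
    return out

def is_k_serial_witness(h, s, k):
    """True if the serialization s satisfies the k-serial condition for h."""
    h_t = by_thread(sorted(h, key=lambda e: e.inv))
    if by_thread(s) != h_t:              # s(t) = h(t) must hold for every thread
        return False
    pos = {e: n for n, e in enumerate(s)}
    for seq in h_t.values():
        for i, e_prime in enumerate(seq):
            for e in h:
                if e_prime.res < e.inv:  # e' precedes e in real time
                    # every h(t)(j) with j <= i - k must come before e in s
                    if any(pos[seq[j]] > pos[e] for j in range(0, i - k + 1)):
                        return False
    return True

# Thread A enqueues three items; afterwards thread B dequeues and sees "empty".
a1 = Event("A", "enq(1)", 0, 1)
a2 = Event("A", "enq(2)", 2, 3)
a3 = Event("A", "enq(3)", 4, 5)
b1 = Event("B", "deq()=empty", 6, 7)
h = [a1, a2, a3, b1]
s = [b1, a1, a2, a3]   # valid for a queue only if the deq comes first
```

In this example A performs three events that B never observes, and s passes the check for k = 3 but fails it for k = 2, matching the informal reading that a thread may run at most k events without synchronizing.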


Informally, a threaded history is k-serial if a thread cannot continue execution for more than
k events without synchronizing with other threads. In SC proper, since there is no explicit
requirement for synchronization, for any k ∈ N one can construct a threaded 𝒟-history such
that it is not k-serial as long as 𝒟 has at least one robust sequence. In linearizability, this
bound is by definition 0; i.e. h is 0-serial iff h is linearizable. We state these results formally.

► Lemma 21. Let 𝒟 contain at least one robust sequence.

1. For any k ∈ N, there exists a sequence which is (k + 1)-serial but not k-serial.
2. If h is linearizable, then it is 0-serial.


Event d_o = m(d_i)
  if self = j then
    d_o ← Obj[self].m(d_i);
    newseq ← lseq · m(d_i, d_o);
    if notRobust(lseq) [∨ cnt ≥ k] then
      atomic <commit(lseq)>;
      atomic <d_o ← O.m(d_i)>;
      lseq ← ε; [cnt = 0;]
    else
      lseq ← newseq; [cnt++;]
  else
    atomic <d_o ← O.m(d_i)>;
  return d_o;

Figure 2 The template for m(d_i, d_o) in k-SC.

It is straightforward to implement k-SC data structures by modifying the singular
implementations given in the previous section (modifications shown in square brackets in
Fig. 2). Events are performed locally without synchronization as long as the sequence so far
has been robust and its length is less than k (the additional disjunct cnt ≥ k). Once either of
the conditions is violated, the effects of all events seen so far are committed to the shared
data structure. Until then, the local sequence and its length are updated (the increment
cnt++). Observe that if k is taken to be 0, the additional disjunct will always evaluate to
true, forcing synchronization at each call, thereby guaranteeing linearizability.
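
The k-SC template can be sketched in Python for a queue as follows. This is an illustration under several assumptions that are not taken from the paper: the thread identity is passed explicitly as tid instead of being read from self, enq_only is a hypothetical stand-in for the robustness test (only enqueue-only sequences count as robust), the robustness check is applied to the extended sequence newseq as the prose describes, and the counter test is read as cnt ≥ k so that k = 0 synchronizes on every call.

```python
import threading
from collections import deque

class SeqQueue:
    """Sequential FIFO queue; deq returns None when the queue is empty."""
    def __init__(self):
        self.items = deque()

    def enq(self, x):
        self.items.append(x)

    def deq(self):
        return self.items.popleft() if self.items else None

def enq_only(seq):
    """Hypothetical robustness test: only enqueue-only sequences are robust."""
    return all(method == "enq" for method, _ in seq)

class KSCQueue:
    def __init__(self, k, designated, is_robust=enq_only):
        self.k = k
        self.designated = designated   # the thread identifier j
        self.is_robust = is_robust
        self.shared = SeqQueue()       # the object O, visible to all threads
        self.local = SeqQueue()        # Obj[j], used only by the designated thread
        self.lock = threading.Lock()
        self.lseq = []                 # local, not yet committed calls
        self.cnt = 0

    def call(self, tid, method, *args):
        if tid != self.designated:
            with self.lock:            # atomic <d_o <- O.m(d_i)>
                return getattr(self.shared, method)(*args)
        d_o = getattr(self.local, method)(*args)
        newseq = self.lseq + [(method, args)]
        if not self.is_robust(newseq) or self.cnt >= self.k:
            with self.lock:            # atomic <commit(lseq)>
                for m, a in self.lseq:
                    getattr(self.shared, m)(*a)
            with self.lock:            # atomic <d_o <- O.m(d_i)>
                d_o = getattr(self.shared, method)(*args)
            self.lseq, self.cnt = [], 0
        else:
            self.lseq, self.cnt = newseq, self.cnt + 1
        return d_o

q = KSCQueue(k=2, designated="A")
q.call("A", "enq", 1)
q.call("A", "enq", 2)
hidden = q.call("B", "deq")    # A's two enqueues are still local
q.call("A", "enq", 3)          # cnt has reached k: commit, then apply on O
visible = q.call("B", "deq")   # B now sees the committed prefix
```

With k = 2 the designated thread hides at most two events from the others before it is forced to commit. Setting k = 0 makes every call go through the shared object.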


6 Conclusion 


We have shown that sequential consistency, despite its appeal, is too weak to be used as an
alternative to linearizability in specifying concurrent data structure correctness. For almost
all well-known data structures, sequentially consistent implementations thereof can have
undesirable behavior. For instance, it is possible for a thread in a sequentially consistent
queue implementation to observe the queue as empty regardless of what the other threads
are doing.

As a first step to bridge the gap between sequentially consistent and linearizable
implementations, we also propose a quantitative constraint to capture implementations that lie
between the two consistency conditions. In a k-SC implementation, a thread is allowed to
proceed without synchronization only for a determined number of consecutive events, after
which it is required to synchronize.

One possible direction for future work is to develop concrete data structures that are k-SC
and to investigate the relation between particular values of k and some notion of overall
progress. Another possibility is to check whether a similar strengthening of other consistency
conditions, either weaker than (memory models of modern processors, such as x86 or ARM)
or incomparable to (e.g. quiescent consistency) sequential consistency, is useful.